Add Support for Efficient Inference #47

Merged: 171 commits, Nov 20, 2024

elvircrn
Collaborator

@elvircrn elvircrn commented Sep 11, 2024

This PR adds support for the following:

  • Efficient SPQR CUDA-based matvec kernel implementation for a subset of parameters
  • Integration of said kernel for end-to-end inference
  • Kernel benchmarks
  • End-to-end inference demo and benchmarks

@elvircrn elvircrn marked this pull request as draft September 12, 2024 13:59
Review threads (all outdated and resolved):
  • .gitignore
  • inference_demo.py
  • spqr/mul_ops.py
  • spqr/profile_spqr.py
@elvircrn elvircrn requested review from Vahe1994 and removed request for Vahe1994 November 20, 2024 12:19
Owner

@Vahe1994 Vahe1994 left a comment

Great work! Thank you!

@elvircrn elvircrn merged commit 8f02edd into Vahe1994:main Nov 20, 2024
2 checks passed

2 participants